Abstract

We study the Monadic Second Order (MSO) hierarchy over colourings of the discrete plane, and draw links between
classes of formulas and classes of subshifts. We give a characterization of existential MSO in terms of projections
of tilings, and of universal sentences in terms of combinations of "pattern counting" subshifts. Conversely, we
characterise logic fragments corresponding to various classes of subshifts (subshifts of finite type, sofic subshifts, all
subshifts). Finally, we show by a separation result how the situation here differs from the case of tiling pictures
studied earlier by Giammarresi et al.
1. Introduction

There is a close connection between words and monadic second-order (MSO) logic. Büchi and Elgot proved for
finite words that MSO formulas correspond exactly to regular languages. This relationship was developed for other
classes of labeled graphs; trees and infinite words enjoy a similar connection. See [1, 2] for a survey of existing results.
Colorings of the entire plane, i.e. tilings, represent a natural generalization of biinfinite words to higher dimensions, and
as such enjoy similar properties. In this paper we study tilings from the point of view of monadic second-order
logic.

Tilings and logic have a shared history. The introduction of tilings can be traced back to Hao Wang [3], who
introduced his celebrated tiles to study the (un)decidability of the $\forall\exists\forall$ fragment of first-order logic. The undecidability
of the domino problem, proved by his PhD student Berger [4], then led to the undecidability of this fragment [5]. Seese [6, 7]
used the domino problem to prove that graphs with a decidable MSO theory have bounded tree width. Makowsky [8,
9] used the construction by Robinson [10] to give the first example of a finitely axiomatizable super-stable theory.
More recently, Oger [11] gave generalizations of classical results on tilings to locally finite relational
structures. See the survey [12] for more details.

Previously, a finite variant of tilings, called tiling pictures, was studied [13, 14]. Tiling pictures correspond to
colorings of a finite region of the plane, this region being bordered by special '#' symbols. It is proven for this
particular model that the languages recognized by EMSO formulas correspond exactly to so-called finite tiling systems,
i.e. projections of finite tilings.

The equivalent of finite tiling systems for infinite pictures are the so-called sofic subshifts [15]. A sofic subshift
intuitively represents local properties and ensures that every point of the plane behaves in the same way. As a consequence,
there is no general way to enforce that some specific color, say □, appears at least once. Hence, some simple
first-order existential formulas have no equivalent as a sofic subshift (or even as a subshift). This is where the border of '#'
symbols for finite pictures plays an important role: without such a border, results on finite pictures would also stumble on this
issue. See [16] for similar results on finite pictures without borders.

We deal primarily in this article with subshifts. See [17] for other acceptance conditions (what we call subshifts
of finite type corresponds to A-acceptance in that paper).

Finally, note that all decision problems in our context are non-trivial: deciding whether a universal first-order formula is
satisfiable (the domino problem, presented earlier) is undecidable. Worse, it is $\Sigma^1_1$-hard to decide whether a tiling of the plane
exists where some given color appears infinitely often [18, 17]. As a consequence, the satisfiability of MSO formulas
is at least $\Sigma^1_1$-hard.

In this paper, we prove how various classes of formulas correspond to well-known classes of subshifts. Some
of the results of this paper were already presented in [19].



2. Symbolic Spaces and Logic 

2.1. Configurations 

Consider the discrete lattice $\mathbb{Z}^2$. For any finite set Q, a Q-configuration is a function from $\mathbb{Z}^2$ to Q. Q may be seen
as a set of colors or states. An element of $\mathbb{Z}^2$ will be called a cell. A configuration will usually be denoted C, M or N.

Fig. 1 shows an example of two different configurations of Z 2 over a set Q of 5 colors. As a configuration is 
infinite, only a finite fragment of the configurations is represented in the figure. The reader has to use his imagination 
to decide what colors do appear in the rest of the configuration. We choose not to represent which cell of the picture 
is the origin (0, 0). This will indeed be of no importance as we use only translation invariant properties. 

For any $z \in \mathbb{Z}^2$ we denote by $\sigma_z$ the shift map of vector z, i.e. the function from Q-configurations to
Q-configurations such that for all $C \in Q^{\mathbb{Z}^2}$:

$$\forall z' \in \mathbb{Z}^2,\quad \sigma_z(C)(z') = C(z' - z).$$



Figure 1: Two configurations, M and N

A pattern is a partial configuration. A pattern $P : X \to Q$ where $X \subseteq \mathbb{Z}^2$ occurs in $C \in Q^{\mathbb{Z}^2}$ at position $z_0$ if

$$\forall z \in X,\quad C(z_0 + z) = P(z).$$

We say that P occurs in C if it occurs at some position in C. As an example the pattern P of Fig. 2 occurs in the
configuration M but not in N (or more accurately not on the finite fragment of N depicted in the figure). A finite
pattern is a partial configuration of finite domain. All patterns in the following will be finite. The language $\mathcal{L}(C)$ of a
configuration C is the set of finite patterns that occur in C. We naturally extend this notion to sets of configurations.
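The shift map and the notion of occurrence can be made concrete with a small Python sketch. This is only an illustration under stated assumptions: a configuration is modelled as a function on pairs of integers, the search is restricted to a finite window (a configuration is infinite), and the names `shift`, `occurs_at`, `occurrences_in_window`, `quarter` and `corner` are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Cell = Tuple[int, int]
Config = Callable[[Cell], str]   # a Q-configuration, i.e. a map Z^2 -> Q
Pattern = Dict[Cell, str]        # a finite pattern, i.e. a map X -> Q with X finite


def shift(z: Cell, config: Config) -> Config:
    """Return sigma_z(C), defined by sigma_z(C)(z') = C(z' - z)."""
    return lambda zp: config((zp[0] - z[0], zp[1] - z[1]))


def occurs_at(pattern: Pattern, config: Config, z0: Cell) -> bool:
    """P occurs in C at z0 iff C(z0 + z) = P(z) for every z in the domain of P."""
    return all(config((z0[0] + x, z0[1] + y)) == color
               for (x, y), color in pattern.items())


def occurrences_in_window(pattern: Pattern, config: Config, radius: int) -> List[Cell]:
    """Positions z0 in [-radius, radius]^2 at which P occurs (finite search only)."""
    span = range(-radius, radius + 1)
    return [(x, y) for x in span for y in span if occurs_at(pattern, config, (x, y))]


# Hypothetical example: "b" on the quarter plane {x >= 0, y >= 0}, "w" elsewhere.
def quarter(z: Cell) -> str:
    return "b" if z[0] >= 0 and z[1] >= 0 else "w"


# A pattern that singles out the corner of the quarter plane.
corner: Pattern = {(0, 0): "b", (-1, 0): "w", (0, -1): "w"}
```

With these definitions, `corner` occurs in `quarter` only at the origin, and shifting the configuration by (2, 3) moves that occurrence to (2, 3), as the definition of $\sigma_z$ predicts.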

A subshift is a natural concept that captures both the notion of uniformity and locality: the only description
"available" from a configuration C is the finite patterns it contains, that is $\mathcal{L}(C)$. Given a set $\mathcal{F}$ of patterns, let $X_{\mathcal{F}}$ be
the set of all configurations where no pattern of $\mathcal{F}$ occurs:

$$X_{\mathcal{F}} = \{ C \mid \mathcal{L}(C) \cap \mathcal{F} = \emptyset \}$$

$\mathcal{F}$ is usually called the set of forbidden patterns or the forbidden language. A set of the form $X_{\mathcal{F}}$ is called a subshift.

Figure 2: A pattern P. P appears in M but presumably not in N

Figure 3: A (finite) set of forbidden patterns $\mathcal{F}$ and the tilings it generates



A subshift can be equivalently defined by topological considerations. Endow the set of configurations $Q^{\mathbb{Z}^2}$ with
the product topology: a sequence $(C_n)_{n \in \mathbb{N}}$ of configurations converges to a configuration C if the sequence ultimately
agrees with C on every $z \in \mathbb{Z}^2$. Then a subshift is a closed subset of $Q^{\mathbb{Z}^2}$ that is also closed by shift maps.

Example 1. Consider the three forbidden patterns of Figure 3. The first one says that we cannot find a ■ point at the
left of a □ point. This can be interpreted as follows: every time we find a ■ point, then all the points at the right of
it are also ■. With the second forbidden pattern, we deduce that every time we find a ■ point, then the entire quarter
of plane above and to the right of it is also filled with ■ points. The third pattern ensures that every configuration
contains at most one quarter of plane of color ■: if it contains two such quarters of plane, then there must be a bigger
quarter of plane that contains both.

Hence a typical configuration looks like A. Other possible configurations are B, C, D, E. They correspond to
extremal situations where the corner of the quarter of plane is situated respectively at $(0, -\infty)$, $(-\infty, 0)$, $(-\infty, -\infty)$ and
$(+\infty, +\infty)$.

Example 2. Consider the set of colors {□, ■} and let $\mathcal{F}$ be the set of patterns that contain two ■ points or more.

Then $X_{\mathcal{F}}$ contains the configurations with at most one ■ point. Up to shift, $X_{\mathcal{F}}$ then contains two configurations: the
all-□ one, and one where only one point is ■ and all others are □.
A subshift of finite type (or tiling) corresponds to a finite set $\mathcal{F}$: it is the set of configurations C such that no pattern
in $\mathcal{F}$ occurs in C. If all patterns of $\mathcal{F}$ are of diameter n, this means that we only have to see a configuration through
a window of size n to know if it is a tiling, hence the locality. Example 1 is a subshift of finite type. It can be proven
that Example 2 is not.
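The locality of a subshift of finite type can be illustrated by a window scan. The Python sketch below is an assumption-laden illustration, not the paper's construction: the three forbidden patterns are written with hypothetical color names `"b"` (■) and `"w"` (□) and are taken to be the ones described for Example 1, and only a finite window of an infinite configuration is inspected.

```python
from itertools import product

# Forbidden patterns as maps from offsets to colors ("b" = black, "w" = white).
FORBIDDEN = [
    {(0, 0): "b", (1, 0): "w"},               # black immediately left of white
    {(0, 0): "b", (0, 1): "w"},               # black immediately below white
    {(0, 0): "w", (1, 0): "b", (0, 1): "b"},  # white with black to the east and north
]


def violations(config, radius, forbidden=FORBIDDEN):
    """Return (position, pattern index) for every forbidden occurrence anchored in the window."""
    found = []
    for x, y in product(range(-radius, radius + 1), repeat=2):
        for i, pattern in enumerate(forbidden):
            if all(config((x + dx, y + dy)) == c for (dx, dy), c in pattern.items()):
                found.append(((x, y), i))
    return found


def quarter(z):
    """One black quarter plane with corner at the origin."""
    return "b" if z[0] >= 0 and z[1] >= 0 else "w"


def two_quarters(z):
    """Union of two black quarter planes with corners (0, 3) and (3, 0)."""
    in_first = z[0] >= 0 and z[1] >= 3
    in_second = z[0] >= 3 and z[1] >= 0
    return "b" if in_first or in_second else "w"
```

A single quarter plane produces no violation, whereas two incomparable quarter planes trigger the third pattern at the inner corner (2, 2), which matches the remark that two quarters force a bigger one containing both.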

Given two state sets $Q_1$ and $Q_2$, a projection is a map $\pi : Q_1 \to Q_2$. We naturally extend it to $\pi : Q_1^{\mathbb{Z}^2} \to Q_2^{\mathbb{Z}^2}$
by $\pi(C)(z) = \pi(C(z))$. A sofic subshift of state set $Q_2$ is the image by some projection $\pi$ of some subshift of finite
type of state set $Q_1$. It is also a subshift (clearly closed by shift maps, and topologically closed because projections
are continuous maps on a compact space). A sofic subshift is a natural object in tiling theory, although rarely
mentioned explicitly. It represents the concept of decoration: some of the tiles we assemble to obtain the tilings may
be decorated, but we forget the decoration when we observe the tiling.

Example 3. Consider the following variant of Example 1: tilings are exactly the same except that the corner of the
quarter of plane in A is of a different, third color. It is easy to see that this variant defines a subshift of finite type X (with
a few more forbidden patterns).

Now consider the map $\pi$ that sends the corner color to ■ and the two other colors to □.

Then B, C, D, E will become under $\pi$ of color □, while A will become a configuration with exactly one ■, all other
points being □.

As a consequence, $\pi(X)$ is exactly Example 2. Example 2 is thus a sofic subshift.
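The cellwise extension of a projection is easy to sketch in Python. The color names below (`"w"` for the background, `"b"` for the quarter plane, `"c"` for the decorated corner) and the helper names are hypothetical, and counting is done on a finite window only.

```python
def project(pi, config):
    """Extend pi: Q1 -> Q2 cellwise, i.e. pi(C)(z) = pi(C(z))."""
    return lambda z: pi[config(z)]


# Hypothetical projection: the corner color becomes black, everything else white.
PI = {"w": "w", "b": "w", "c": "b"}


def decorated_quarter(z):
    """A configuration of type A with its corner at the origin."""
    if z == (0, 0):
        return "c"
    return "b" if z[0] >= 0 and z[1] >= 0 else "w"


def all_fill(z):
    """An extremal configuration: the quarter plane covers everything, no corner."""
    return "b"


def count_color(config, color, radius):
    """Number of cells of the given color inside [-radius, radius]^2."""
    span = range(-radius, radius + 1)
    return sum(1 for x in span for y in span if config((x, y)) == color)
```

On the window, the image of `decorated_quarter` contains exactly one black cell and the image of `all_fill` contains none, in line with the description of $\pi(X)$ as the configurations with at most one ■.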
2.2. Structures 

A configuration will be seen in this article as an infinite structure. The signature $\tau$ contains four unary maps North,
South, East, West and a predicate $P_c$ for each color $c \in Q$.

A configuration M will be seen as a structure $\mathfrak{M}$ in the following way:

• The elements of $\mathfrak{M}$ are the points of $\mathbb{Z}^2$.

• North is interpreted by $\mathrm{North}^{\mathfrak{M}}((x, y)) = (x, y + 1)$, East is interpreted by $\mathrm{East}^{\mathfrak{M}}((x, y)) = (x + 1, y)$. $\mathrm{South}^{\mathfrak{M}}$
and $\mathrm{West}^{\mathfrak{M}}$ are interpreted similarly.

• $P_c^{\mathfrak{M}}((x, y))$ is true if and only if the point at coordinate (x, y) is of color c, that is if M(x, y) = c.

As an example, the configuration M of Fig. 1 has three consecutive cells of the same color c, for one of the colors c of Q. That is, the following
formula is true:

$$\exists z,\ P_c(z) \wedge P_c(\mathrm{East}(z)) \wedge P_c(\mathrm{East}(\mathrm{East}(z)))$$

As another example, the following formula states that the configuration has a vertical period of 2 (the color in the
cell (x, y) is the same as the color in the cell (x, y + 2)). The formula is false in the structure $\mathfrak{M}$ and true in the structure
$\mathfrak{N}$ (if the reader chose to color the cells of N not shown in the picture correctly):

$$\forall z,\ \bigwedge_{c \in Q} \big( P_c(z) \Rightarrow P_c(\mathrm{North}(\mathrm{North}(z))) \big)$$
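The two example formulas can be evaluated mechanically once North, East and the color predicates are interpreted as above. The following Python sketch assumes hypothetical configurations `stripes` and `diagonal` over colors `"a"` and `"b"`, and it restricts both quantifiers to a finite window, so it only approximates truth in the infinite structure.

```python
def north(z):
    return (z[0], z[1] + 1)


def east(z):
    return (z[0] + 1, z[1])


def window(radius):
    span = range(-radius, radius + 1)
    return [(x, y) for x in span for y in span]


def three_in_a_row(config, color, radius):
    """Exists z in the window with P_c(z), P_c(East(z)) and P_c(East(East(z)))."""
    return any(config(z) == color
               and config(east(z)) == color
               and config(east(east(z))) == color
               for z in window(radius))


def vertical_period_2(config, colors, radius):
    """For all z in the window and all colors c: P_c(z) implies P_c(North(North(z)))."""
    return all(config(z) != c or config(north(north(z))) == c
               for z in window(radius) for c in colors)


def stripes(z):
    """Horizontal stripes of height 1 (hypothetical example)."""
    return "a" if z[1] % 2 == 0 else "b"


def diagonal(z):
    """Diagonal lines of color "a" every third diagonal (hypothetical example)."""
    return "a" if (z[0] + z[1]) % 3 == 0 else "b"
```

Here `stripes` satisfies both formulas on any window, while `diagonal` has no vertical period 2 and never shows three equal cells in a row.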
2.3. Monadic Second-Order Logic

This paper studies connections between subshifts (seen as structures as explained above) and monadic second-order
sentences. First-order variables (x, y, z, ...) are interpreted as points of $\mathbb{Z}^2$ and (monadic) second-order variables (X,
Y, Z, ...) as subsets of $\mathbb{Z}^2$.

Monadic second order formulas are defined as follows: 

• a term is either a first-order variable or a function (South, North, East, West) applied to a term;

• atomic formulas are of the form $t_1 = t_2$ or $X(t_1)$ where $t_1$ and $t_2$ are terms and X is either a second-order variable
or a color predicate;

• formulas are built up from atomic formulas by means of boolean connectives and quantifiers $\exists$ and $\forall$ (which
can be applied either to first-order variables or to second-order variables).

A formula is closed if no variable occurs free in it. A formula is FO if no second-order quantifier occurs in it. A
formula is EMSO if it is of the form

$$\exists X_1, \dots, \exists X_n,\ \phi(X_1, \dots, X_n)$$

where $\phi$ is FO. Given a formula $\phi(X_1, \dots, X_n)$ with no free first-order variable and having only $X_1, \dots, X_n$ as free
second-order variables, a configuration M together with subsets $E_1, \dots, E_n$ is a model of $\phi(X_1, \dots, X_n)$, denoted

$$(M, E_1, \dots, E_n) \models \phi(X_1, \dots, X_n),$$

if $\phi$ is satisfied (in the usual sense) when M is interpreted as $\mathfrak{M}$ (see previous section) and $E_i$ interprets $X_i$.

2.4. Definability 

This paper studies the following problems: given a formula $\phi$ of some logic, what can be said of the configurations
that satisfy $\phi$? Conversely, given a subshift, what kind of formula can characterise it?

Definition 1. A set S of Q-configurations is defined by $\phi$ if

$$S = \{ M \in Q^{\mathbb{Z}^2} \mid \mathfrak{M} \models \phi \}$$

Two formulas $\phi$ and $\phi'$ are equivalent iff they define the same set of configurations.
A set S is C-definable if it is defined by a formula $\phi \in C$.

It is easy to see that Example 1 is defined by the formula

$$\phi:\ \big(\forall x,\ \neg(P_■(x) \wedge P_□(\mathrm{East}(x)))\big) \wedge \big(\forall x,\ \neg(P_■(x) \wedge P_□(\mathrm{North}(x)))\big) \wedge \big(\forall x,\ \neg(P_□(x) \wedge P_■(\mathrm{East}(x)) \wedge P_■(\mathrm{North}(x)))\big)$$

or equivalently by the formula

$$\phi':\ \forall x,\ P_■(x) \Leftrightarrow \big(P_■(\mathrm{East}(x)) \wedge P_■(\mathrm{North}(x))\big)$$

We will see some variants of formula $\phi'$ appear in a few theorems below.
Example 2 is defined by the formula

$$\psi:\ \forall x, y,\ \big(P_■(x) \wedge P_■(y)\big) \Rightarrow x = y$$

Note that a definable set is always closed by shift (a shift between 2 configurations induces an isomorphism between
corresponding structures). It is not always topologically closed: the set of {□, ■}-configurations defined by the formula
$\phi: \exists z,\ P_■(z)$ contains all configurations except the all-white one, hence is not closed.



When we are dealing with MSO formulas, the following remark is useful: second-order quantifiers may be
represented as projection operations on sets of configurations. We now formalize this notion.

If $\pi : Q_1 \to Q_2$ is a projection and S is a set of $Q_1$-configurations, we define the two following operators:

$$E(\pi)(S) = \{ M \in Q_2^{\mathbb{Z}^2} \mid \exists N \in Q_1^{\mathbb{Z}^2},\ \pi(N) = M \wedge N \in S \}$$

$$A(\pi)(S) = \{ M \in Q_2^{\mathbb{Z}^2} \mid \forall N \in Q_1^{\mathbb{Z}^2},\ \pi(N) = M \Rightarrow N \in S \}$$

Note that A is a dual of E, that is $A(\pi)(S) = {}^c E(\pi)({}^c S)$ where ${}^c$ represents complementation.
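The duality between the two operators is a purely set-theoretic fact, so it can be checked exhaustively on a toy universe. In the Python sketch below, "configurations" are replaced by tuples of two cells (an assumption made only to keep the universe finite), and `Q1`, `Q2`, `PI` and the helper names are hypothetical.

```python
from itertools import combinations, product

Q1 = ["w", "b", "c"]
Q2 = ["w", "b"]
PI = {"w": "w", "b": "w", "c": "b"}   # hypothetical projection Q1 -> Q2
CELLS = 2                             # toy "configurations" are tuples of 2 cells

ALL1 = list(product(Q1, repeat=CELLS))
ALL2 = list(product(Q2, repeat=CELLS))


def proj(n):
    """Apply the projection cellwise."""
    return tuple(PI[c] for c in n)


def E(s):
    """E(pi)(S): the M such that some N in S projects onto M."""
    return {proj(n) for n in s}


def A(s):
    """A(pi)(S): the M such that every N projecting onto M lies in S."""
    return {m for m in ALL2 if all(n in s for n in ALL1 if proj(n) == m)}


def co(s, universe):
    """Complement of s inside the given universe."""
    return set(universe) - set(s)


def duality_holds(s):
    """Check A(pi)(S) = complement of E(pi)(complement of S)."""
    return A(s) == co(E(co(s, ALL1)), ALL2)
```

Enumerating all 512 subsets of the toy universe confirms the identity; elements of `ALL2` without any preimage belong to `A(s)` vacuously and are also outside every `E(...)`.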

Proposition 1.

• A set S of Q-configurations is EMSO-definable if and only if there exists a set S' of Q'-configurations and a
map $\pi : Q' \to Q$ such that $S = E(\pi)(S')$ and S' is FO-definable.

• The class of MSO-definable sets is the closure of the class of FO-definable sets by the operators E and A.

Proof (Sketch). We prove here only the first item.

• Let $\phi = \exists X, \psi$ be an EMSO formula that defines a set S of Q-configurations. Let $Q' = Q \times \{0, 1\}$ and $\pi$ be the
canonical projection from Q' to Q.

Consider the formula $\psi'$ obtained from $\psi$ by replacing $X(t)$ by $\bigvee_{c \in Q} P_{(c,1)}(t)$ and $P_c(t)$ by $P_{(c,0)}(t) \vee P_{(c,1)}(t)$.

Let S' be the set of Q'-configurations defined by $\psi'$. Then it is clear that $S = E(\pi)(S')$. The generalization to
more than one existential quantifier is straightforward.

• Let $S = E(\pi)(S')$ be a set of Q-configurations, with S' FO-definable by the formula $\phi$. Denote by $c_1, \dots, c_n$ the
elements of Q'. Consider the formula $\phi'$ obtained from $\phi$ where each $P_{c_i}$ is replaced by $X_i$. Let

$$\psi = \exists X_1, \dots, \exists X_n,\ \Big(\forall z, \bigvee_i X_i(z)\Big) \wedge \Big(\forall z, \bigwedge_{i \neq j} (\neg X_i(z) \vee \neg X_j(z))\Big) \wedge \Big(\forall z, \bigwedge_i (X_i(z) \Rightarrow P_{\pi(c_i)}(z))\Big) \wedge \phi'$$

Then $\psi$ defines S. Note that the formula $\psi$ constructed above is of the form $\exists X_1, \dots, \exists X_n, (\forall z, \psi'(z)) \wedge \phi'$. This
will be important later. □

Second-order quantifications will then be regarded in this paper either as projection operators or as set quantifiers.

3. Hanf Locality Lemma and EMSO 

First-order logic has a property that makes it suitable to deal with tilings and configurations: it is local. This is
illustrated by Hanf's lemma [20, 21, 22]. A square pattern of radius n is a pattern of domain $[-n, n] \times [-n, n]$.

Definition 2. Two Q-configurations M and N are (n, k)-equivalent if for each Q-square pattern P of radius n:

• If P appears in M less than k times, then P appears the exact same number of times in M and in N

• If P appears in M more than k times, then P appears in N more than k times

This notion is indeed an equivalence relation. Given n and k, it is clear that there are only finitely many equivalence
classes for this relation.
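The idea behind (n, k)-equivalence is to count occurrences of square patterns only up to a threshold. The Python sketch below illustrates this on a finite window, which is an assumption: in the paper the counts range over the whole plane and may be infinite. It also assumes that counts up to k are compared exactly and anything larger is lumped together; all names are hypothetical.

```python
from collections import Counter


def square_patterns(config, n, window):
    """Count radius-n square patterns centered in [-window, window]^2."""
    counts = Counter()
    span = range(-window, window + 1)
    offsets = [(dx, dy) for dx in range(-n, n + 1) for dy in range(-n, n + 1)]
    for x in span:
        for y in span:
            counts[tuple(config((x + dx, y + dy)) for dx, dy in offsets)] += 1
    return counts


def truncate(counts, k):
    """Keep exact counts up to k; every larger count becomes k + 1 ("many")."""
    return {p: min(c, k + 1) for p, c in counts.items()}


def window_equivalent(conf_m, conf_n, n, k, window):
    """Compare thresholded pattern counts of two configurations on a window."""
    return (truncate(square_patterns(conf_m, n, window), k)
            == truncate(square_patterns(conf_n, n, window), k))


def all_white(z):
    return "w"


def one_black(z):
    return "b" if z == (0, 0) else "w"


def shifted_black(z):
    return "b" if z == (1, 1) else "w"


def two_black(z):
    return "b" if z in {(0, 0), (3, 0)} else "w"
```

On this window a configuration and its shift are indistinguishable, and a threshold k = 0 cannot separate one black cell from two, while k = 1 can.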

Hanf's local lemma can be formulated in our context as follows:

Theorem 2. For every FO formula $\phi$, there exists (n, k) such that
if M and N are (n, k)-equivalent, then $\mathfrak{M} \models \phi \Leftrightarrow \mathfrak{N} \models \phi$.



Corollary 3. Every FO-definable set is a (finite) union of some (n, k)-equivalence classes. 

This is theorem 3.3 in [14], stated for finite configurations. Lemma 3.5 in the same paper gives a proof of Hanf's 
Local Lemma in our context. 

Given (P, k) we consider the set $S_{=k}(P)$ of all configurations such that the pattern P occurs exactly k times (k may
be taken equal to 0). The set $S_{>k}(P)$ is the set of all configurations such that the pattern P occurs more than k times.

We may rephrase the preceding corollary as:

Corollary 4. Every FO-definable set is a positive combination (i.e. unions and intersections) of some $S_{=k}(P)$ and
some $S_{>k}(P)$.

Theorem 5. Every EMSO-definable set can be defined by a formula $\phi$ of the form:

$$\exists X_1, \dots, \exists X_n,\ \big(\forall z_1,\ \phi_1(z_1, X_1, \dots, X_n)\big) \wedge \big(\exists z_1, \dots, \exists z_p,\ \phi_2(z_1 \dots z_p, X_1, \dots, X_n)\big),$$

where $\phi_1$ and $\phi_2$ are quantifier-free formulas.

See [1, Corollary 4.1] or [23, Corollary 4.2] for a similar result. This result is an easy consequence of [24, Theorem
3.2] (see also the corrigendum). We include here a full proof.

Proof. Let C be the set of such formulas. We proceed in three steps:

• Every EMSO-definable set is the projection of a positive combination of some $S_{=k}(P)$ and $S_{>k}(P)$ (using prop.
1 and the preceding corollary)

• Every $S_{=k}(P)$ (resp. $S_{>k}(P)$) is C-definable

• C-definable sets are closed by (finite) union, intersection and projections.

C-definable sets are closed by projection using the equivalence of prop. 1 in the two directions, the note at the end of
its proof and some easy formula equivalences. The same goes for intersection.

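Now we prove that C-definable sets are closed by union. The difficulty is to ensure that we use only one universal
quantifier. Let $\phi$ and $\phi'$ be two C-formulas defining sets $S_1$ and $S_2$. We can suppose that $\phi$ and $\phi'$ use the same
numbers of second-order quantifiers and of first-order existential quantifiers; write $\phi_1, \phi_2$ (resp. $\phi'_1, \phi'_2$) for the
quantifier-free parts of $\phi$ (resp. $\phi'$).

Then the formula

$$\exists X, \exists X_1, \dots, \exists X_n,\ \Big(\forall z_1,\ \begin{cases} X(z_1) \Leftrightarrow X(\mathrm{North}(z_1)) \\ X(z_1) \Leftrightarrow X(\mathrm{East}(z_1)) \\ X(z_1) \Rightarrow \phi_1(z_1, X_1 \dots X_n) \\ \neg X(z_1) \Rightarrow \phi'_1(z_1, X_1 \dots X_n) \end{cases}\Big) \wedge \Big(\exists z_1, \dots, \exists z_p,\ \begin{cases} X(z_1) \Rightarrow \phi_2(z_1 \dots z_p, X_1 \dots X_n) \\ \neg X(z_1) \Rightarrow \phi'_2(z_1 \dots z_p, X_1 \dots X_n) \end{cases}\Big)$$

defines $S_1 \cup S_2$ (the disjunction is obtained through variable X which is forced to represent either the empty set or
the whole plane $\mathbb{Z}^2$).

It is now sufficient to prove that a $S_{=k}(P)$ set (resp. a $S_{>k}(P)$ set) is definable by a C-formula. Let $\phi_P(z)$ be the
quantifier-free formula such that $\phi_P(z)$ is true if and only if P appears at position z.
Then $S_{=k}(P)$ is definable by

$$\exists X_1 \dots \exists X_k, \exists A_1, \dots, \exists A_k,\ \Big(\forall x,\ \begin{cases} \bigwedge_i A_i(x) \Leftrightarrow [A_i(\mathrm{North}(x)) \wedge A_i(\mathrm{East}(x))] \\ \bigwedge_i X_i(x) \Leftrightarrow [A_i(x) \wedge \neg A_i(\mathrm{South}(x)) \wedge \neg A_i(\mathrm{West}(x))] \\ \bigwedge_{i \neq j} X_i(x) \Rightarrow \neg X_j(x) \\ \big(\bigvee_i X_i(x)\big) \Leftrightarrow \phi_P(x) \end{cases}\Big) \wedge \exists z_1, \dots, \exists z_k,\ X_1(z_1) \wedge \dots \wedge X_k(z_k)$$

The formula ensures indeed that $A_i$ represents a quarter of the plane, $X_i$ being a singleton representing the corner of
that quarter of plane. If k = 0 this becomes $\forall x, \neg\phi_P(x)$. To obtain a formula for $S_{>k}(P)$, change the last $\Leftrightarrow$ to a $\Rightarrow$ in the
formula. □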



4. Characterization of Subshifts of Finite Type and Sofic Subshifts

4.1. Subshifts of Finite Type

We start with a characterization of subshifts of finite type (SFTs, i.e. tilings). The problem with SFTs is that they
are closed neither by projection nor by union. As a consequence, the corresponding class of formulas is not very
interesting:

Theorem 6. A set of configurations is an SFT if and only if it is defined by a formula of the form

$$\forall z,\ \psi(z)$$

where $\psi$ is quantifier-free.

Note that there is only one quantifier in this formula. Formulas with more than one universal quantifier do not always
correspond to SFTs: this is due to SFTs not being closed by union.

Proof. Let $P_1, \dots, P_n$ be patterns. To each $P_i$ we associate the quantifier-free formula $\phi_{P_i}(z)$ which is true if and only
if $P_i$ appears at the position z. Then the subshift that forbids the patterns $P_1, \dots, P_n$ is defined by the formula:

$$\forall z,\ \neg\phi_{P_1}(z) \wedge \dots \wedge \neg\phi_{P_n}(z)$$

Conversely, let $\psi$ be a quantifier-free formula. Each term $t_i$ in $\psi$ is of the form $f_i(z)$ where $f_i$ is some combination
of the functions North, South, East and West, each $f_i$ thus representing some vector $z_i$ ($f_i(z) = z + z_i$). Let
Z be the collection of all vectors $z_i$ that appear in the formula $\psi$. Now the fact that $\psi$ is true at the position z only
depends on the colors of the configuration at the points $(z + z_1), \dots, (z + z_n)$, i.e. on the pattern of domain Z that occurs
at position z. Let $\mathcal{P}$ be the set of patterns of domain Z that make $\psi$ false. Then the set S defined by $\forall z, \psi(z)$ is the set of
configurations where no pattern of $\mathcal{P}$ occurs, hence an SFT. □
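The converse direction of the proof is effectively an algorithm: enumerate the patterns of domain Z and keep those that falsify the formula. A minimal Python sketch follows, assuming the quantifier-free formula is given as a predicate on the colors found at the offsets of Z; the names `forbidden_patterns`, `OFFSETS` and `psi` are hypothetical, and `psi` encodes the formula $P_■(z) \Leftrightarrow (P_■(\mathrm{East}(z)) \wedge P_■(\mathrm{North}(z)))$ with `"b"` standing for ■.

```python
from itertools import product


def forbidden_patterns(psi, offsets, colors):
    """Return all patterns of domain `offsets` that make psi false."""
    bad = []
    for assignment in product(colors, repeat=len(offsets)):
        pattern = dict(zip(offsets, assignment))
        if not psi(pattern):
            bad.append(pattern)
    return bad


# Offsets used by the terms z, East(z) and North(z).
OFFSETS = [(0, 0), (1, 0), (0, 1)]


def psi(p):
    """P_b(z) <=> (P_b(East(z)) and P_b(North(z))), read off a pattern p."""
    return (p[(0, 0)] == "b") == (p[(1, 0)] == "b" and p[(0, 1)] == "b")


BAD = forbidden_patterns(psi, OFFSETS, ["w", "b"])
```

For this formula four of the eight patterns are forbidden. They all have full domain Z, so they refine the three smaller forbidden patterns of Example 1 rather than coinciding with them.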

4.2. Universal sentences 

Due to the way subshifts are defined, universal quantifiers play an important role. We now ask the following
question: what are the sets defined by universal formulas? First, the following lemma shows that we can restrict to
first-order logic when considering universal formulas.

Lemma 7. Any universal formula is equivalent to a first-order universal formula.

Proof. A universal formula is equivalent (through permutation of universal quantifiers) to a formula of the form

$$\forall x_1, \dots, x_p, \forall X_1, \dots, X_n,\ \Phi(X_1, \dots, X_n, x_1, \dots, x_p)$$

where $\Phi$ is quantifier-free. Consider the formula

$$\psi(X_1, \dots, X_{n-1}, x_1, \dots, x_p) = \forall X_n,\ \Phi(X_1, \dots, X_n, x_1, \dots, x_p)$$

Let $\{t_1, \dots, t_k\}$ be the set of terms t such that $X_n(t)$ occurs in $\Phi$. The idea is that the truth value of $\Phi(X_1, \dots, X_n, x_1, \dots, x_p)$
depends only on the value of $X_n$ at the positions represented by the $(t_i)$. Depending on the interpretations of the variables
$(x_i)$, the interpretations of the terms $(t_i)$ may be equal or not. We say an assignment $\rho : \{1, \dots, k\} \to \{0, 1\}$ is sound if
$t_i = t_j \Rightarrow \rho(i) = \rho(j)$. Denote by $\phi_\rho(x_1, \dots, x_p)$ the quantifier-free formula expressing this condition:

$$\phi_\rho(x_1, \dots, x_p) = \bigwedge_{(i,j):\ \rho(i) \neq \rho(j)} t_i \neq t_j.$$

Let $\psi_\rho$ denote the formula $\Phi[X_n(t_i) \leftarrow \rho(i)]$ obtained from $\Phi$ by replacing each occurrence of $X_n(t_i)$ by the truth
value $\rho(i)$, and this for each $i \in \{1, \dots, k\}$. For any fixed $x_1, \dots, x_p$, the truth value of $\forall X_n, \Phi(X_1, \dots, X_n, x_1, \dots, x_p)$ is the
same as the truth value of the conjunction of the formulas $\psi_\rho$ for all sound $\rho$. Hence, we get that $\psi(X_1, \dots, X_{n-1}, x_1, \dots, x_p)$
is equivalent to the following quantifier-free formula:

$$\bigwedge_{\rho : \{1, \dots, k\} \to \{0, 1\}} \big(\phi_\rho \Rightarrow \psi_\rho\big)$$

We can eliminate this way second-order universal quantifiers one by one and the lemma follows. □
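The elimination step can be sanity-checked by brute force on a small example. The Python sketch below makes several assumptions that are not in the paper: the plane is replaced by a 3 × 3 torus so that all subsets X can be enumerated, and the quantifier-free formula is the hypothetical $\Phi = X(x) \Rightarrow X(\mathrm{East}(y))$, whose terms are $t_1 = x$ and $t_2 = \mathrm{East}(y)$.

```python
from itertools import product

M = 3  # side of the torus, a finite stand-in for Z^2 (assumption)
POINTS = [(i, j) for i in range(M) for j in range(M)]


def east(z):
    return ((z[0] + 1) % M, z[1])


def terms(x, y):
    """Interpretations of t_1 = x and t_2 = East(y)."""
    return [x, east(y)]


def phi(values):
    """Phi with X(t_i) replaced by values[i]; here X(t_1) => X(t_2)."""
    return (not values[0]) or values[1]


def forall_X(x, y):
    """Direct semantics: Phi must hold for every subset X of the torus."""
    ts = terms(x, y)
    for bits in product([False, True], repeat=len(POINTS)):
        X = dict(zip(POINTS, bits))
        if not phi([X[t] for t in ts]):
            return False
    return True


def via_sound_assignments(x, y):
    """Conjunction of Phi[rho] over the assignments rho that are sound for (x, y)."""
    ts = terms(x, y)
    k = len(ts)
    for rho in product([False, True], repeat=k):
        sound = all(rho[i] == rho[j]
                    for i in range(k) for j in range(k) if ts[i] == ts[j])
        if sound and not phi(list(rho)):
            return False
    return True
```

Both evaluations agree for every pair (x, y) of the torus; for this particular $\Phi$ the universal statement holds exactly when x = East(y).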

For the rest of this section we focus on first-order universal formulas. The real difficulty is to treat the equality
predicate (=). Without equality (more precisely if all predicates and functions are only unary) any first-order
universal formula is equivalent to a conjunction of formulas with only one quantifier and theorem 6 applies. The
equality predicate intertwines the variables and makes things a bit harder to prove. The reader might for example try
to understand what the following formula exactly means, where c and c' are colors of Q:

$$\forall x, y,\ \big(P_c(x) \wedge P_{c'}(\mathrm{East}(y))\big) \Rightarrow x = y$$

To understand it, we will prove an analog of Hanf's lemma for universal sentences.

Definition 3. Let (n, k) be integers, and M, N two Q-configurations. We say that $M \geq_{n,k} N$ if for each Q-square
pattern P of radius at most n:

• If P appears in M exactly p times and $p \leq k$, then P appears at most p times in N

• (No condition is required if P appears in M more than k times)

Note that M and N are (n, k)-equivalent if and only if $M \geq_{n,k} N$ and $N \geq_{n,k} M$.

Theorem 8. For every universal formula $\phi$ there exists (n, k) such that if $M \geq_{n,k} N$, then $\mathfrak{M} \models \phi \Rightarrow \mathfrak{N} \models \phi$.

Compare with definition 2 and theorem 2. Note that Gaifman's Theorem (a more refined version of Hanf's lemma)
was generalized in [25] to existential sentences. We may use this result to obtain ours. This would however add some
unnecessary complications.

Proof. We will translate the usual proof of Hanf's local lemma into our special case. We will try as much as possible
to use the same notations as [21, sec. 2.4].

We first change the vocabulary and consider that East, West, North, South are binary predicates rather than
functions. Note that every universal formula will remain a universal formula, albeit with more quantifiers.

Let us introduce some notations. Let S(r, a) be the set of all points at distance at most r of a. That is $S(r, a) =
\{x : |x - a| \leq r\}$ where $|\cdot|$ is the Manhattan distance. Note that S(r, a) contains $e_r = 2r^2 + 2r + 1$ points. Let
$S(r, a_1 \dots a_p) = \bigcup_i S(r, a_i)$.
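The cardinality $2r^2 + 2r + 1$ corresponds to the closed Manhattan ball (distance at most r), which a short enumeration confirms. The Python sketch below is only a check of that count; the function names are hypothetical.

```python
def ball(r, a=(0, 0)):
    """Closed Manhattan ball S(r, a): points at distance at most r from a."""
    return {(a[0] + dx, a[1] + dy)
            for dx in range(-r, r + 1)
            for dy in range(-r, r + 1)
            if abs(dx) + abs(dy) <= r}


def ball_union(r, centers):
    """S(r, a_1 ... a_p): the union of the balls around each center."""
    points = set()
    for a in centers:
        points |= ball(r, a)
    return points


def e(r):
    """Closed-form number of points of a ball of radius r."""
    return 2 * r * r + 2 * r + 1
```

For p centers the union contains at most $p \cdot e_r$ points, with equality when the balls are pairwise disjoint; this is the kind of bound used later in the proof.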

Let M and N be two Q-configurations. We say that $a_1 \dots a_p \in (\mathbb{Z}^2)^p$ and $b_1 \dots b_p \in (\mathbb{Z}^2)^p$ are k-isomorphic if
there exists a bijective map f from $S(3^k, a_1 \dots a_p)$ to $S(3^k, b_1 \dots b_p)$ that preserves the relations, that is

• $x\ \mathrm{East}\ y \Leftrightarrow f(x)\ \mathrm{East}\ f(y)$ (and similarly for North, South and West)

• $P_c(x) \Leftrightarrow P_c(f(x))$

• $f(a_i) = b_i$.

It is then clear that if $a_1 \dots a_p$ and $b_1 \dots b_p$ are 0-isomorphic, then we have $\mathfrak{M} \models \psi(a_1 \dots a_p) \Leftrightarrow \mathfrak{N} \models
\psi(b_1 \dots b_p)$ whenever $\psi$ is quantifier-free.

Now take a formula $\phi = \forall x_1 \dots x_n,\ \psi(x_1 \dots x_n)$ where $\psi$ is quantifier-free.
Let M and N be such that $M \geq_{3^n,\ n e_{3^n} + 1} N$.
We now prove by induction on p that

if $a_1 \dots a_p$ and $b_1 \dots b_p$ are (n - p)-isomorphic, then for all $b_{p+1}$, there exists $a_{p+1}$ such that $a_1 \dots a_{p+1}$ and $b_1 \dots b_{p+1}$
are (n - p - 1)-isomorphic.

• Case p = 0. Let $b_1 \in \mathbb{Z}^2$. Consider the pattern of radius $3^n$ centered around $b_1$ in N. This pattern appears in N,
hence must appear in M at least one time. Take $a_1$ to be the center of this pattern.

• Case $p \mapsto p + 1$. Let $a_1 \dots a_p$ and $b_1 \dots b_p$ be (n - p)-isomorphic. Let $b_{p+1} \in \mathbb{Z}^2$.

- Case 1: $|b_{p+1} - b_i| \leq 2 \times 3^{n-p-1}$ for some $b_i$.

In this case $S(3^{n-p-1}, b_{p+1}) \subseteq S(3^{n-p}, b_i)$. Hence by taking $a_{p+1} = f^{-1}(b_{p+1})$ where f is the bijective map
involved in the (n - p)-isomorphism, it is clear that $a_1 \dots a_{p+1}$ and $b_1 \dots b_{p+1}$ are (n - p - 1)-isomorphic.

- Case 2: for all i, $|b_{p+1} - b_i| > 2 \times 3^{n-p-1}$. In this case for every i, $S(3^{n-p-1}, b_{p+1}) \cap S(3^{n-p-1}, b_i) = \emptyset$.
Consider the pattern P of radius $3^n$ centered around $b_{p+1}$.

This pattern appears $\alpha$ times inside $S(2 \times 3^{n-p-1}, b_1 \dots b_p)$ where $\alpha \leq p\, e_{2 \times 3^{n-p-1}}$. P appears at least
$\alpha + 1$ times in N and $\alpha + 1 \leq n e_{3^n} + 1$, hence must appear at least $\alpha + 1$ times in M. As it appears the
same number of times in $S(2 \times 3^{n-p-1}, b_1 \dots b_p)$ and $S(2 \times 3^{n-p-1}, a_1 \dots a_p)$ (by (n - p)-isomorphism), it must
appear somewhere else, say centered in $a_{p+1}$. This $a_{p+1}$ is not inside $S(3^{n-p-1}, a_1 \dots a_p)$ because otherwise
it would be the center of an occurrence of the pattern P inside $S(2 \times 3^{n-p-1}, a_1 \dots a_p)$. As a consequence,
$a_1 \dots a_{p+1}$ and $b_1 \dots b_{p+1}$ are (n - p - 1)-isomorphic.

Now suppose that $\mathfrak{M} \models \phi$. Take $b_1 \dots b_n \in \mathbb{Z}^2$. There exist $a_1 \dots a_n$ such that $a_1 \dots a_n$ and $b_1 \dots b_n$ are
0-isomorphic. As $\mathfrak{M} \models \phi$, the quantifier-free formula $\psi(a_1 \dots a_n)$ is true in $\mathfrak{M}$. As a consequence $\psi(b_1 \dots b_n)$ is true in
$\mathfrak{N}$. As this is true for all $b_1 \dots b_n$ we obtain $\mathfrak{N} \models \phi$. □

Given (P, k) we consider the set $S_{\leq k}(P)$ of all configurations such that the pattern P occurs at most k times (k may
be taken equal to 0).

Corollary 9. A set is definable by a universal formula if and only if it is a positive combination (i.e. unions and
intersections) of some $S_{\leq k}(P)$.

Compare to corollary 4.

Proof. Let C be the class of all universal formulas. It is clear that the class of C-definable sets is closed under
intersection and union.

Now $S_{\leq k}(P)$ is defined by

$$\forall x_1, \dots, x_{k+1},\ \Big(\bigwedge_i \phi_P(x_i)\Big) \Rightarrow \bigvee_{i \neq j} x_i = x_j$$

For k = 0, this becomes $\forall x, \neg\phi_P(x)$. Hence, every positive combination of some $S_{\leq k}(P)$ is C-definable.

Conversely, let $\phi$ be a universal formula and S the set it defines. Let (n, k) be as in the theorem.

For each configuration $M \in S$ and P a pattern of radius at most n, denote by $\phi_M(P)$ the number of times P appears
in M, with the convention that $\phi_M(P) = \infty$ if P appears more than k times in M.

Consider the set

$$S_M = \bigcap_{\mathrm{radius}(P) \leq n} S_{\leq \phi_M(P)}(P)$$

(where $S_{\leq \infty}(P)$ is the set of all configurations). From the hypothesis on (n, k), we have $S_M \subseteq S$. It is then easy to see
that $S = \bigcup_M S_M$ where the union is actually finite (two configurations that are (n, k)-equivalent give the same $S_M$).

□
4.3. Sofic subshifts

Using the previous corollary, we are now able to give a characterisation of sofic subshifts:

Theorem 10. A set S is a sofic subshift if and only if it is definable by a formula of the form

$$\exists X_1, \dots, \exists X_n, \forall z_1, \dots, \forall z_p,\ \psi(X_1, \dots, X_n, z_1 \dots z_p)$$

where $\psi$ is quantifier-free. Moreover, any such formula is equivalent to a formula of the same form but with a single
universal quantifier (p = 1).

See [19] for a different proof that eliminates equality predicates one by one.

Proof. Let C be the class of all formulas of the form

$$\exists X_1, \dots, \exists X_n, \forall z,\ \psi(X_1, \dots, X_n, z)$$

where $\psi$ is quantifier-free. With the help of theorem 6 and proposition 1, it is quite clear that C-defined sets are exactly
sofic subshifts.

Let $\mathcal{D}$ be the class of all formulas of the form

$$\exists X_1, \dots, \exists X_n, \forall z_1 \dots z_p,\ \psi(X_1, \dots, X_n, z_1 \dots z_p)$$

where $\psi$ is quantifier-free. The previous remark states that sofic subshifts are $\mathcal{D}$-defined.

Now we prove that $\mathcal{D}$-defined sets are sofic subshifts. Using (the proof of) proposition 1, and the fact that sofic
subshifts are closed under projection, it is sufficient to prove that universal formulas define sofic subshifts. Using
corollary 9 and the fact that sofic subshifts are closed under union and projections, it is sufficient to prove that every
$S_{\leq k}(P)$ is sofic.

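Now $S_{\leq k}(P)$ is defined by

$$\phi:\ \exists S_1 \dots \exists S_k,\ \Big(\bigwedge_i \Psi_i\Big) \wedge \Big(\forall x,\ \bigvee_i S_i(x) \Leftrightarrow \phi_P(x)\Big)$$

where $\Psi_i$ expresses that $S_i$ has at most one element and is defined as follows:

$$\Psi_i \stackrel{\mathrm{def}}{=} \exists A, \forall x,\ \begin{cases} A(x) \Leftrightarrow A(\mathrm{North}(x)) \wedge A(\mathrm{East}(x)) \\ S_i(x) \Leftrightarrow A(x) \wedge \neg A(\mathrm{South}(x)) \wedge \neg A(\mathrm{West}(x)) \end{cases}$$

Now with some light rewriting we can transform $\phi$ into a formula of the class C, which proves that $S_{\leq k}(P)$ is
C-definable, hence sofic. □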



5. (E)MSO-definable subshifts

5.1. Separation result 

Theorems 5 and 10 above suggest that EMSO-definable subshifts are not necessarily sofic. We will show in this 
section that the set of EMSO-definable subshifts is indeed strictly larger than the set of sofic subshifts. The proof is 
based on the analysis of the computational complexity of forbidden languages. It is well-known that sofic subshifts 
have a recursively enumerable forbidden language. The following theorem shows that the forbidden language of an 
MSO-definable subshift can be arbitrarily high in the arithmetical hierarchy. 

This is not surprising since arbitrary Turing computation can be defined via first order formulas (using tilesets) 
and second order quantifiers can be used to simulate quantification of the arithmetical hierarchy. However, some care 
must be taken to ensure that the set of configurations obtained is a subshift. 



Theorem 11. Let E be an arithmetical set. Then there is an MSO-definable subshift with forbidden language $\mathcal{F}$ such
that E reduces to $\mathcal{F}$ (for many-one reduction).
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Proof (sketch). Suppose that the complement of E is defined as the set of integers m such that: 



3x\, V*2, . . . , 3/Vx„,R(m, x\, . . . , 



where R is a recursive relation. We first build a formula $\phi$ defining the set of configurations representing a successful 
computation of R on some input $m, x_1, \ldots, x_n$. Consider 3 colors $c_l$, $c$ and $c_r$ and additional second order variables 
$X_1, \ldots, X_n$ and $S_1, \ldots, S_n$. The input $(m, x_1, \ldots, x_n)$ to the computation is encoded in unary on a horizontal segment 
using colors $c_l$ and $c_r$ and variables $S_i$ as separators, precisely: first an occurrence of $c_l$, then m occurrences of $c$, then 
an occurrence of $c_r$ and, for each successive $1 \le i \le n$, $x_i$ positions in $X_i$ before a position of $S_i$. Let $\phi_1$ be the FO 
formula expressing the following: 

1. there is exactly 1 occurrence of $c_l$ and the same for $c_r$, and all $S_i$ are singletons; 

2. starting from an occurrence of $c_l$ and going east until reaching $S_n$, the only possible successions of states are those 
forming a valid input as explained above. 
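
As an illustration of the unary encoding just described, the following Python sketch builds the horizontal segment for a given input and reads it back. The function names `encode_input` and `decode_input`, the string labels for colors, and the representation of a cell as a (color, set of variables) pair are assumptions made for this example only. In particular, we assume that every cell other than the two end markers carries color $c$, since only three colors are available.

```python
def encode_input(m, xs):
    """Return the horizontal segment encoding (m, x_1, ..., x_n) in unary.

    Each cell is a pair (color, variables): color is "c_l", "c" or "c_r",
    and variables is the set of second order variables containing the cell.
    Assumption: cells after c_r are colored "c".
    """
    cells = [("c_l", set())]                    # one occurrence of c_l
    cells += [("c", set()) for _ in range(m)]   # m occurrences of c
    cells.append(("c_r", set()))                # one occurrence of c_r
    for i, x in enumerate(xs, start=1):
        # x_i positions in X_i, then a single separator position in S_i
        cells += [("c", {f"X_{i}"}) for _ in range(x)]
        cells.append(("c", {f"S_{i}"}))
    return cells


def decode_input(cells):
    """Recover (m, [x_1, ..., x_n]) from a segment built by encode_input."""
    colors = [color for color, _ in cells]
    left, right = colors.index("c_l"), colors.index("c_r")
    m = right - left - 1
    xs, run = [], 0
    for _, variables in cells[right + 1:]:
        if any(v.startswith("S_") for v in variables):
            xs.append(run)   # separator reached: close the current block
            run = 0
        else:
            run += 1
    return m, xs
```

For instance, `encode_input(2, [1, 3])` produces a segment of 10 cells, and `decode_input` applied to it gives back `(2, [1, 3])`.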

Now, the computation of R on any input encoded as above can be simulated via tiling constraints in the usual way. 
Consider sufficiently many new second order variables $Y_1, \ldots, Y_p$ to handle the computation and let $\phi_2$ be the FO 
formula expressing that: 

1. a valid computation starts at the north of an occurrence of $c_l$; 

2. there is exactly one occurrence of the halting state (represented by some $Y_i$) in the whole configuration. 
We define $\phi$ by: 

\[ \exists X_1, \forall X_2, \ldots, \exists/\forall X_n, \exists S_1, \ldots, \exists S_n, \exists Y_1, \ldots, \exists Y_p, \phi_1 \wedge \phi_2. \]

Finally let $\psi$ be the following FO formula: $(\forall z, \neg P_{c_l}(z)) \vee (\forall z, \neg P_{c_r}(z))$. Let X be the set defined by $\phi \vee \psi$. By construction, 
a finite (unidimensional) pattern of the form $c_l c^m c_r$ appears in some configuration of X if and only if $m \notin E$. Therefore 
E is many-one reducible to the forbidden language of X. 

To conclude the proof it is sufficient to check that X is closed. To see this, consider a sequence $(C_n)_n$ of configurations 
of X converging to some configuration C. C has at most one occurrence of $c_l$ and one occurrence of $c_r$. If one 
of these two states does not occur in C then $C \in X$ since $\psi$ is verified. If, conversely, both $c_l$ and $c_r$ occur (once each) 
then any pattern containing both occurrences also occurs in some configuration $C_n$ verifying $\phi$. But $\phi$ is such that any 
modification outside the segment between $c_l$ and $c_r$ in $C_n$ does not change the fact that $\phi$ is satisfied provided no new 
$c_l$ and $c_r$ colors are added. Therefore $\phi$ is also satisfied by C and $C \in X$. □ 

The theorem gives the claimed separation result for subshifts of EMSO. 

Corollary 12. There are EMSO-definable subshifts which are not sofic. 

Proof. In the previous theorem, choose E to be the complement of the set of integers m for which there is x such that 
machine m halts on empty input in less than x steps. E is not recursively enumerable and, using the construction of 
the proof above, it is reducible to the forbidden language of an EMSO-definable subshift. Since sofic subshifts have a 
recursively enumerable forbidden language, this subshift is not sofic. □ 

5.2. Definability of MSO-subshifts 

As we saw before, sets defined by MSO-formulas are not always subshifts. We will try in this section to find a 
fragment of MSO that contains only subshifts and contains all of them. This fragment is somewhat ad hoc. Finding a 
more reasonable fragment is an interesting open question. 

We begin with a definition. 



Definition 4. 

\[
\mathrm{fin}(S) : \exists A, \exists B \left\{
\begin{array}{l}
\forall x, A(x) \Leftrightarrow A(\mathrm{North}(x)) \wedge A(\mathrm{East}(x)) \\
\forall x, B(x) \Leftrightarrow B(\mathrm{South}(x)) \wedge B(\mathrm{West}(x)) \\
\exists x, A(x) \wedge \neg A(\mathrm{South}(x)) \wedge \neg A(\mathrm{West}(x)) \\
\exists x, B(x) \wedge \neg B(\mathrm{North}(x)) \wedge \neg B(\mathrm{East}(x)) \\
\forall x, S(x) \Rightarrow A(x) \wedge B(x)
\end{array}
\right.
\]

It is easy to prove that fin(S) is true if and only if S is finite (there are finitely many x such that S(x)). Indeed A and B 
represent quarter planes, and S must be contained in the rectangle delimited by the two quarter planes. Any other 
formula true only if S is finite would work in the following. 
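
To see why quarter planes witness finiteness, the Python sketch below computes, for a finite nonempty set S of positions, two quarter planes A and B bounding it, and checks the five clauses of fin(S) with positions ranging over a finite window. The helper names, the coordinate convention (North increases y, East increases x) and the restriction of the quantifiers to a finite window are assumptions of this illustration only.

```python
def north(p): return (p[0], p[1] + 1)
def south(p): return (p[0], p[1] - 1)
def east(p):  return (p[0] + 1, p[1])
def west(p):  return (p[0] - 1, p[1])


def quarter_planes(S):
    """Return predicates A (north-east quarter plane) and B (south-west one)
    whose intersection is the bounding rectangle of the finite nonempty set S."""
    xs = [x for x, _ in S]
    ys = [y for _, y in S]
    x0, y0, x1, y1 = min(xs), min(ys), max(xs), max(ys)

    def A(p):
        return p[0] >= x0 and p[1] >= y0

    def B(p):
        return p[0] <= x1 and p[1] <= y1

    return A, B


def check_fin_clauses(S, A, B, window):
    """Check the five clauses of fin(S), quantifying over `window` only."""
    S = set(S)
    return (
        all(A(p) == (A(north(p)) and A(east(p))) for p in window)
        and all(B(p) == (B(south(p)) and B(west(p))) for p in window)
        and any(A(p) and not A(south(p)) and not A(west(p)) for p in window)
        and any(B(p) and not B(north(p)) and not B(east(p)) for p in window)
        and all(A(p) and B(p) for p in window if p in S)
    )
```

A half plane, by contrast, satisfies the first clause but has no corner, so the third clause fails: this is what prevents an infinite S from being covered.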

Theorem 13. Let S be an MSO-definable set. Then S is a subshift if and only if it is definable by a formula of the form 

\[ \forall S, \mathrm{fin}(S) \Rightarrow \exists B_1 \ldots B_k, \psi(S, B_1 \ldots B_k) \wedge \forall x_1 \ldots x_p, S(x_1) \wedge \ldots \wedge S(x_p) \Rightarrow \theta(S, B_1 \ldots B_k, x_1 \ldots x_p) \]

where 

• $\psi$ is any MSO-formula not containing the predicates $P_c$. 

• $\theta$ is quantifier-free. 

Note that this formula can be written more concisely as 

\[ \forall^{\mathrm{fin}} S, \exists B, \psi(S, B) \wedge \forall \bar{x} \in S^p, \theta(S, B, \bar{x}) \]



Proof. First we prove that such a formula $\varphi$ defines a subshift X. For this, we prove that the set X is closed. Consider 
a sequence $(M_i)_{i \ge 1}$ of configurations of X converging to some configuration M. We must prove that $M \in X$. 

Let S be a finite set. Now consider the formula $\theta$. As it is quantifier-free, it is local: the value of $\theta(S, B_1 \ldots B_k, x_1 \ldots x_p)$ 
depends only on what happens around $x_1 \ldots x_p$. As each of $x_1 \ldots x_p$ must be in S, there exists a finite $S' \supseteq S$ such that 
the value of $\forall x_1 \in S \ldots \forall x_p \in S, \theta(S, B_1 \ldots B_k, x_1 \ldots x_p)$ depends only on the value of the predicates S, $P_c$ and $B_i$ on S'. 

Now $(M_i)$ converges to M. This means that there exists j such that $M_j$ and M coincide on S'. For this $M_j$, there 
exist some $B_1 \ldots B_k$ such that we have $\mathfrak{M}_j \models \psi(S, B_1 \ldots B_k) \wedge \forall x_1 \in S \ldots \forall x_p \in S, \theta(S, B_1 \ldots B_k, x_1 \ldots x_p)$. Then 
this formula is also true on $\mathfrak{M}$ (note indeed that $\psi(S, B_1 \ldots B_k)$ does not depend on the configuration). 

Hence we have found for every S some $B_i$ that make the formula true, that is we have proven $\mathfrak{M} \models \varphi$. Therefore 
X is closed, hence a subshift. 

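Now let X be an MSO-definable subshift. X is defined by a formula $\varphi$. Change each $P_c$ in $\varphi$ by a predicate $B_c$ to 
obtain $\psi_1$. Define 

\[ \psi(B) = \psi_1 \wedge \forall x, \Big( \bigvee_{c} B_c(x) \wedge \bigwedge_{c \neq c'} \neg (B_c(x) \wedge B_{c'}(x)) \Big) \]

Then X is defined by 

\[ \phi : \forall^{\mathrm{fin}} S, \exists B, \psi(B) \wedge \forall x \in S, \bigwedge_{c} (B_c(x) \Leftrightarrow P_c(x)) \]

Indeed M satisfies $\phi$ if and only if every pattern of M is a pattern in some configuration of X. □ 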



6. A Characterization of EMSO 

EMSO-definable sets are projections of FO-definable sets (proposition 1). Besides, sofic subshifts are projections 
of subshifts of finite type (or tilings). Previous results show that the correspondence sofic ↔ EMSO fails. However, we 
will show in this section how EMSO can be characterized through projections of "locally checkable" configurations. 

Corollary 4 expresses that FO-definable sets are essentially captured by counting occurrences of patterns up to 
some value. The key idea in the following is that this counting can be achieved by local checkings (equivalently, by 
tiling constraints), provided it is limited to a finite and explicitly delimited region. This idea was successfully used 
in [14] in the context of picture languages: pictures are rectangular finite patterns with a border made explicit using a 
special state (which occurs all along the border and nowhere else). We will proceed here quite differently. Instead of 
putting special states on borders of some rectangular zone, we will simply require that two special subsets of states $Q_0$ 
and $Q_1$ are present in the configuration: we call a $(Q_0, Q_1)$-marked configuration any configuration that contains both 
some color $q \in Q_0$ and some color $q' \in Q_1$ somewhere. By extension, given a subshift X over Q and two subsets $Q_0 \subseteq Q$ 
and $Q_1 \subseteq Q$, the doubly-marked set $X_{Q_0,Q_1}$ is the set of $(Q_0, Q_1)$-marked configurations of X. Finally, a doubly-marked 
set of finite type is a set $X_{Q_0,Q_1}$ for some SFT X and some $Q_0, Q_1$. 

Figure 4: The rectangular zone in dark gray defined by predicate Z(z). 

Lemma 14. For any finite pattern P and any k > 0, $S_{=k}(P)$ is the projection of some doubly-marked set of finite type. 
The same result holds for $S_{\ge k}(P)$. 

Moreover, any positive combination (union and intersection) of projections of doubly-marked sets of finite type is 
also the projection of some doubly-marked set of finite type. 
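
To make the pattern-counting sets concrete, here is a small Python sketch that counts the occurrences of a pattern P inside a finite window. The function name `count_occurrences` and the encoding of windows and patterns as lists of equal-length strings are assumptions of this example. The sets $S_{=k}(P)$ and $S_{\ge k}(P)$ concern whole configurations, so a finite window only gives a lower bound on the true count.

```python
def count_occurrences(window, pattern):
    """Count the positions of `window` at which `pattern` occurs.

    Both arguments are lists of equal-length strings, one string per row.
    Overlapping occurrences are counted separately.
    """
    H, W = len(window), len(window[0])
    h, w = len(pattern), len(pattern[0])
    return sum(
        all(window[i + di][j:j + w] == pattern[di] for di in range(h))
        for i in range(H - h + 1)
        for j in range(W - w + 1)
    )
```

A window with at least k occurrences already certifies that any configuration extending it has at least k occurrences of P, whereas certifying exactly k requires information about the whole plane.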

Proof (sketch). We consider some base alphabet Q, some pattern P and some k > 0. We will build a doubly-marked 
set of finite type over alphabet $Q' = Q \times Q_+$ and then project back on Q. $Q_+$ is itself a product of different layers. The 
first layer can take values {0, 1, 2} and is devoted to the definition of the marker subsets $Q_0$ and $Q_1$: a state is in $Q_i$ for 
$i \in \{0, 1\}$ if and only if its value on the first layer is i. 

We first show how to convert the appearance in a configuration of two marked positions, by $Q_0$ and $Q_1$, into a 
locally identifiable rectangular zone. The zone is defined by two opposite corners corresponding to an occurrence of 
some state of $Q_0$ and $Q_1$ respectively. This can be done using only finite type constraints as follows. By adding a new 
layer of states, one can ensure that there is a unique occurrence of a state of $Q_0$ and maintain everywhere the following 
information: 

1. $N_{Q_0}(z) \equiv$ the position z is at the north of the (unique) occurrence of a state from $Q_0$, 

2. $E_{Q_0}(z) \equiv$ the position z is at the east of the occurrence of a state from $Q_0$. 

The same can be done for $Q_1$. From that, the membership to the rectangular zone is defined at any position z by the 
following predicate (see figure 4): 

\[ Z(z) \equiv N_{Q_0}(z) \neq N_{Q_1}(z) \wedge E_{Q_0}(z) \neq E_{Q_1}(z). \]
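
The predicate Z can be sketched in Python as follows, for two marker positions given explicitly. The function name `zone`, the coordinate convention and the choice of non-strict comparisons for "at the north of" and "at the east of" are assumptions of this illustration. With these conventions the zone is a half-open rectangle having the two markers as opposite corners.

```python
def zone(marker0, marker1, window):
    """Return the set of positions z of `window` for which Z(z) holds.

    marker0 and marker1 are the positions (x, y) of the unique occurrences
    of a state of Q_0 and of a state of Q_1.
    """
    def north_of(z, m):   # N(z): z is at the north of marker m
        return z[1] >= m[1]

    def east_of(z, m):    # E(z): z is at the east of marker m
        return z[0] >= m[0]

    return {
        z for z in window
        if north_of(z, marker0) != north_of(z, marker1)
        and east_of(z, marker0) != east_of(z, marker1)
    }
```

The definition is symmetric in the two markers, and the zone is empty when both markers lie on the same row or on the same column.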

We can also define locally the border of the zone: precisely, cells not in the zone but adjacent to it. Now define 
P(z) to be true if and only if z is the lower-left position in an occurrence of the pattern P. We add k new layers, each 
one storing (among other things) a predicate $C_i(z)$ verifying 

\[ C_i(z) \Rightarrow Z(z) \wedge P(z) \wedge \bigwedge_{j \neq i} \neg C_j(z). \]

Moreover, on each layer i, we enforce that exactly 1 position z verifies $C_i(z)$: this can be done by maintaining 
north/south and east/west tags (as for $Q_0$ above) and requiring that the north (resp. south) border of the rectangular 
zone sees only the north (resp. south) tag and the same for east/west. Finally, we add the constraint: 

\[ P(z) \wedge Z(z) \Rightarrow \bigvee_{i} C_i(z) \]

expressing that each occurrence of P in the zone must be "marked" by some $C_i$. Hence, the only admissible $(Q_0, Q_1)$-marked 
configurations are those whose rectangular zone contains exactly k occurrences of pattern P. We thus obtain 
exactly $S_{\ge k}(P)$ after projection. To obtain $S_{=k}(P)$, it suffices to add the constraint: 

\[ P(z) \Rightarrow Z(z) \]

in order to forbid occurrences of P outside the rectangular zone. 

To conclude the proof we show that finite unions or intersections of projections of doubly-marked sets of finite 
type are also projections of doubly-marked sets of finite type. Consider two SFT X over Q and Y over Q' and two 
pairs of marker subsets $Q_0, Q_1 \subseteq Q$ and $Q'_0, Q'_1 \subseteq Q'$. Let $\pi_1 : Q \to A$ and $\pi_2 : Q' \to A$ be two projections. 

First, for the case of union, we can suppose (up to renaming of states) that Q and Q' are disjoint and define the 
SFT $\Sigma$ over alphabet $Q \cup Q'$ as follows: 

• 2 adjacent positions must be both in Q or both in Q'; 

• any pattern forbidden in X or Y is forbidden in $\Sigma$. 

Clearly, $\pi(\Sigma_{Q_0 \cup Q'_0, Q_1 \cup Q'_1}) = \pi_1(X_{Q_0,Q_1}) \cup \pi_2(Y_{Q'_0,Q'_1})$ where $\pi(q)$ is $\pi_1(q)$ when $q \in Q$ and $\pi_2(q)$ else. 
Now, for intersections, consider the SFT $\Sigma$ over the fiber product 

\[ Q_\times = \{(q, q') \in Q \times Q' : \pi_1(q) = \pi_2(q')\} \]

and defined as follows: a pattern is forbidden if its projection on the component Q (resp. Q') is forbidden in X (resp. 
Y). 

If we define $\pi$ as $\pi_1$ applied to the Q-component of states, and if E is the set of configurations of $\Sigma$ such that states 
from $Q_0$ and $Q_1$ appear on the first component and states from $Q'_0$ and $Q'_1$ appear on the second one, then we have: 

\[ \pi(E) = \pi_1(X_{Q_0,Q_1}) \cap \pi_2(Y_{Q'_0,Q'_1}). \]

To conclude the proof, it is sufficient to obtain E as the projection of some doubly-marked set of finite type. This can 
be done starting from $\Sigma$ and adding a new component of states whose behaviour is to define a zone from two markers 
(as in the first part of this proof) and check that the zone contains occurrences of $Q_0$, $Q_1$, $Q'_0$ and $Q'_1$ in the appropriate 
components. □ 
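
The two alphabets used in this last part of the proof can be sketched in Python as follows. The function names and the representation of projections as dictionaries are assumptions made for this illustration. The sketch only builds the alphabets and the combined projection, not the forbidden patterns of the resulting SFT.

```python
def union_projection(pi1, pi2):
    """Projection pi on the disjoint union of Q and Q':
    pi1 on states of Q, pi2 on states of Q'."""
    if set(pi1) & set(pi2):
        raise ValueError("Q and Q' are assumed disjoint (rename states first)")
    return {**pi1, **pi2}


def fiber_product(pi1, pi2):
    """States (q, q') of Q x Q' whose projections pi1(q) and pi2(q') agree."""
    return {(q, qp) for q in pi1 for qp in pi2 if pi1[q] == pi2[qp]}


def product_projection(pi1):
    """Projection pi on the fiber product: pi1 applied to the Q-component."""
    return lambda state: pi1[state[0]]
```

In the fiber product, both components of a state project to the same letter of A, which is why projecting through the first component alone is enough.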

Theorem 15. A set is EMSO-definable if and only if it is the projection of a doubly-marked set of finite type. 

Proof. First, a doubly-marked set of finite type is an FO-definable set because SFT are FO-definable (theorem 6) and 
the restriction to doubly-marked configurations can be expressed through a simple existential FO formula. Thus the 
projection of a doubly-marked set of finite type is EMSO-definable. 

The opposite direction follows immediately from proposition 1 and corollary 4 and the lemma above. □ 

At this point, one could wonder whether considering simply-marked sets of finite type is sufficient to capture EMSO 
via projections. In fact the presence of 2 markers is necessary in the above theorem: considering the set $\Sigma_{Q_0,Q_1}$ where 
$\Sigma$ is the full shift $Q^{\mathbb{Z}^2}$ and $Q_0$ and $Q_1$ are distinct singleton subsets of Q, a simple compactness argument 
shows that it is not the projection of any simply-marked set of finite type. 

7. Open Problems 

• Is the second order alternation hierarchy strict for MSO (considering our model-theoretic equivalence)? 

• One can prove that theorem 6 also holds for formulas of the form: 

\[ \forall X_1 \ldots \forall X_n, \forall z, \psi(z, X_1 \ldots X_n) \]

where $\psi$ is quantifier-free. Hence, adding universal second-order quantifiers does not increase the expressive 
power of formulas of theorem 6. More generally, let C be the class of formulas of the form 

\[ \forall X_1, \exists X_2, \ldots, \forall/\exists X_n, \forall z_1, \ldots, \forall z_p, \phi(X_1, \ldots, X_n, z_1, \ldots, z_p). \]

One can check that any formula in C defines a subshift. Is the second-order quantifier alternation hierarchy 
strict in C? On the contrary, do all formulas in C represent sofic subshifts? 